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Abstract — Compressed sensing is designed to measure sparse 
signals directly in a compressed form. However, most signals 
of interest are only "approximately sparse", i.e. even though 
the signal contains only a small fraction of relevant (large) 
components the other components are not strictly equal to 
zero, but are only close to zero. In this paper we model the 
approximately sparse signal with a Gaussian distribution of small 
components, and we study its compressed sensing with dense 
random matrices. We use replica calculations to determine the 
mean-squared error of the Bayes-optimal reconstruction for such 
signals, as a function of the variance of the small components, 
the density of large components and the measurement rate. We 
then use the G-AMP algorithm and we quantify the region of 
parameters for which this algorithm achieves optimality (for 
large systems). Finally, we show that in the region where the G- 
AMP for the homogeneous measurement matrices is not optimal, 
a special "seeding" design of a spatially-coupled measurement 
matrix allows to restore optimality. 

I. Introduction 

Compressed sensing is designed to measure sparse signals 
directly in a compressed form. It does so by acquiring a 
small number of random linear projections of the signal 
and subsequently reconstructing the signal. The interest in 
compressed sensing was boosted by works (TJ, that showed 
that this reconstruction is computationally feasible in many 
cases. Often in the studies of compressed sensing the authors 
require that the reconstruction method works with guarantees 
for an arbitrary signal. This requirement then has to be paid 
by higher measurement rates than would be necessary if the 
probabilistic properties of the signal were at least approxi- 
mately known. In many situations where compressed sensing 
is of practical interest there is a good knowledge about the 
statistical properties of the signal. In the present paper we 
will treat this second case. 

It has been shown recently 0, that for compressed 
sensing of sparse signals with known empirical distribution 
of components the theoretically optimal reconstruction can be 
achieved with the combined use of G-AMP algorithm 0, 
and seeding (spatially coupled) measurement matrices. Actu- 
ally, [3] argued that for noiseless measurements the knowledge 
of the signal distribution is not even required. It is well 
known that for noiseless measurements exact reconstruction 
is possible down to measurement rates equal to the density of 
non-zero components in the signal. For noisy measurements 



the optimal achievable mean-squared error (MSE) has been 
analyzed and compared to the performance of G-AMP in 0. 

In its most basic form compressed sensing is designed 
for sparse signals, but most signals of interest are only 
"approximately sparse", i.e. even though the signal contains 
only a small fraction of relevant (large) components the other 
components are not strictly equal to zero, but are only close 
to zero. In this paper we model the approximately sparse 
signal by the two-Gaussian distribution as in [8]. We study the 
optimal achievable MSE in the reconstruction of such signals 
and compare it to the performance of G-AMP algorithm using 
its asymptotic analysis - the state evolution [5|, [6|, [7]. Even- 
though we limit ourselves to this special class of signals 
and assume their knowledge, many qualitative features of our 
results stay true for other signals and when the distribution of 
signal-components is not known as well. 

A. Definitions 

We study compressed sensing for approximately sparse 
signals. The TV-dimensional signals that we consider have 
iid components, K of these components are chosen from a 
distribution 4>(x), we define the density p — K/N, and the 
remaining N — K components are Gaussian with zero mean 
and small variance e 

N 

P(x)=l[[p<t>(x i ) + (l-p)M(0,e)} (1) 

Of course no real signal of interest is iid, however, for the 
same reasons as in our analysis applies also to non-iid 
signals with empirical distribution of components converging 
to p4>(x) + (1 — p)JV(0, e). For concreteness our numerical 
examples are for Gaussian cf>(x) of zero mean and unit 
variance. Although the numerical results depend on the form 
of 4>{x), the overall picture is robust with respect to this choice. 
We further assume that the parameters of P(x) are known or 
can be learned via expectation-maximization-type of approach. 

We obtain M measurements y M as linear projections of a 
iV-components signal 

N 

V»=J2 F » iSi ' H = 1,..-,M. (2) 

i=l 



The MxN measurement matrix is denoted i 7 ^ . For simplicity 
we assume the measurements to be noiseless, the case of noisy 
measurements can be treated in the same way as in 0. As 
done traditionally, in the first part of this paper, we consider 
the measurement matrix having iid components of zero mean 
and variance 1/N. In our numerical experiments we chose 
the components of the matrix to be normally distributed, but 
the asymptotic analysis does not depend on the details of the 
components distribution. The seeding measurement matrices 
considered in the second part of this paper will be defined in 
detail in section HVl 

The goal of compressed sensing is to reconstruct the signal s 
based on the knowledge of M measurements y and the M x 
N matrix F. We define a — Al/N to be the measurement 
(sampling) rate. The Bayes optimal way of estimating a signal 
x* that minimizes the MSE E = J2iLi( s * ~ X *) 2 / N with the 
true signal s is given as 

x* = J dx i x l Vi{xi) , (3) 

where Vi(xi) is the marginal probability distribution of the 
variable i 

Vi{xi) = [ P(x|y)J]d^ (4) 
under the posterior measure 

M N 

In this paper we will use an asymptotic replica analysis of this 
optimal Bayes reconstruction, which allows to compute the 
MSE as function of the parameters of the signal distribution, 
p and e, and of the measurement rate a. 

Of course optimal Bayes reconstruction is not computation- 
ally tractable. In order to get an estimate of the marginals 
Vi(xi), we use the G-AMP algorithm that is a belief- 
propagation based algorithm. 

B. Related works 

The l\ -minimization based algorithms 0, are widely 
used for compressed sensing of approximately sparse signals. 
They are very general and provide good performances in many 
situations. They, however, do not achieve optimal reconstruc- 
tion when the statistical properties of the signal are known. 

The two-Gaussian model for approximately sparse signal 
eq. (JT} was used in compressed sensing e.g. in JSJ, (5). 

Belief propagation based reconstruction algorithms were 
introduced in compressed sensing by 0. Authors of [8 | used 
sparse measurement matrices and treated the BP messages as 
probabilities over real numbers, that were represented by a 
histogram. The messages, however, can be represented only 
by their mean and variance as done by 1101 . ATI . Moreover, 
one does not need to send messages between every signal- 
components and every measurements [12], this leads to the 
approximate message passing (AMP). In the context of physics 
of spin glasses this transformation of the belief propagation 



equations corresponds to the Thouless-Anderson-Palmer equa- 
tions [ 13 1. The AMP was generalized for general signal models 
in 0, and called G-AMP. The algorithm used in 0, 
is equivalent to G-AMP. We also want to note that we find 
the name "approximate" message passing a little misleading 
since, as argued e.g. in 0, for dense random measurement 
matrices the G-AMP is asymptotically equivalent to BP, i.e. 
all the leading terms in TV are included in G-AMP. 

For random matrices the evolution of iterations of G-AMP 
on large system sizes is described by state evolution [ 12 1. The 
exactness of this description was proven in large generality in 
fl4l . See also 1151 . 0, for discussions and results on the 
state evolution. 

The optimal reconstruction was studied extensively in lfl6l . 
The replica method was used to analyse the optimal recon- 
struction in compressed sensing in e.g. ifTTl . fl8l . In the 
statistical physics point of view the replica method is closely 
related to the state evolution 0. 

As we shall see the G-AMP algorithm for homogeneous 
measurement matrices matches asymptotically the perfor- 
mance of the optimal reconstruction in a large part of the 
parameter space. In some region of parameters, however, it is 
suboptimal. For the sparse signals, it was demonstrated heuris- 
tically in that optimality can be restored using seeding 
matrices (the concept is called spatial coupling), rigorous proof 
of this was worked out in (4 |. The robustness to measurement 
noise was also discussed in 0, 0. Note that the concept 
of "spatial coupling" thanks to which theoretical thresholds 
can be saturated was developed in error-correcting codes 1 19 1, 
1 20], [ 2 1 1 . In compressed sensing the "spatial coupling" was 
first tested in [9] who did not observe any improvement 
for the two-Gaussian model for reasons that we will clarify 
later in this paper. Basically, the spatial coupling provides 
improvements only if a first order phase transition is present, 
but for the variance of small components that was tested in [9] 
there is no such transition: it appears only for slightly smaller 
values of the variance. 

C. Our Contribution 

Using the replica method, we study the MSE in optimal 
Bayes inference of approximately sparse signals. In parallel, 
we study the asymptotic (large N) performance of G-AMP 
using state evolution. The parameters that we vary are the 
density of large components p = K/N, the variance of the 
small components e and the sampling rate a = Al/N. 

More precisely, we show that for a fixed signal density p, for 
low variance of the small components e < e(p), the optimal 
Bayes reconstruction has a transition at a critical value a = 
a op t, separating a phase with a small value (comparable to e) 
of the MSE, obtained at a > a op t, from a phase with a large 
value of the MSE, obtained at a < a opt . This is a "first order" 
phase transition, in the sense that the MSE is discontinuous at 
a = a op t- 

The G-AMP algorithm exhibits a double phase transition. 
It is asymptotically equivalent to the optimal Bayes inference 
at large asp < a < 1, where it matches the optimal 



reconstruction with a small value of the MSE. At low values of 
a < a op t the G-AMP is also asymptotically equivalent to the 
optimal Bayes inference but in this low-sampling-rate region 
the optimal result leads to a large MSE. In the intermediate 
region a op t < a < q;bp G-AMP leads to large MSE, but the 
optimal Bayes inference leads to low MSE. This is the region 
where one needs to improve on G-AMP. We show that in this 
intermediate region the G-AMP performance can be improved 
with the use of seeding (spatially coupled) measurement 
matrices, and with a proper choice of the parameters of these 
matrices one can approach the performance of the optimal 
Bayes inference in the large system size limit. 

Finally for higher variance of the small components e > 
e(p) there is no phase transition for < a < 1. In this regime, 
G-AMP achieves optimal Bayes inference and the MSE that 
it obtains varies continuously from at a = 1 to large values 
at low measurement rate a. 



II. Bayes optimal and G-AMP reconstruction of 

APPROXIMATELY SPARSE SIGNALS 



If the only available information about the signal is the ma- 
trix F and the vector of measurements y then the information- 
theoretically best possible estimate of each signal component 
is computed as a weighted average over all solutions of the 
linear system Q, where the weight of each solution is given 
by ([TJ. Of course, the undetermined linear system Q has 
exponentially many (in N) solutions and hence computing 
exactly the above weighted average is in general intractable. 

The corresponding expectation can be, however, approx- 
imated efficiently via the generalized approximate message 
passing (G-AMP) algorithm (6), 0, 0, Q that we recall 



in Sec. II-A The behavior of the algorithm in the limit of 
large system sizes can be analyzed via state evolution |14|, 
0, 0, as we recall in Sec. ITFE] 

The asymptotic performance of the optimal reconstruction 
can be analyzed via the replica method as in ifTTl . [18|, 
0, which is closely related to the state evolution of the G- 
AMP algorithm. We summarize the corresponding equations 
in Sec. llFCl 



A. Reminder of the G-AMP Algorithm 

The G-AMP is an iterative message passing algorithm. For 
every measurement component we define quantities and 
uj^, for each signal component quantities Ej, a^, Uj. In the 
G-AMP algorithm these quantities are updated as follows (for 
the derivation in the present notation see Q, for the original 



derivation 10, 0) 

v t+1 = Vf 2 w* 
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Here only the functions f a and fa depend in an explicit way 
on the signal model P(x). For the signal model ([Tji considered 
in this paper we have 
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(13) 



For the approximately sparse signal that we consider in this 
paper we have 



101= p, 
W-2 = 1 — p . 



(14) 
(15) 



A suitable initialization for the quantities is a\~ Q — 0, v\~ a = 
(l-p)e + pa 2 , w*=° = Vr 

Once the convergence of the iterative equations is reached, 
i.e. the quantities do not change anymore under iterations, the 
estimate of the ith signal component is a\. The MSE achieved 
by the algorithm is then E l — J2iLi( a i ~ s i) 2 1^ ■ 

B. Evolution of the algorithm 

In the limit of large system sizes, i.e. when parameters 
p, e, a are fixed whereas N — > oo, the evolution of the G-AMP 
algorithm can be described exactly using the "state evolution" 
fl4l . In the case where the signal model corresponds to the 
statistical properties of the actual signal, as it is the case in 
the present work, the state evolution is stated in terms of a 
single variable E*, the MSE at iteration-time t, which evolves 
as (for a derivation see e.g. lfT4l . 0) 



E t+1 = Tw n 
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(16) 
(17) 
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Fig. 1. Time-evolution of the MSE the G-AMP algorithm achieves (points) 
compared to the asymptotic N — > oo evolution obtained from the state 
evolution eq. [16\ (full lines). Data are obtained for a signal with density 
of large component p = 0.2, variance of the small components e = 10 — 6 . 
The algorithm was used for a signal of N = 3 • 10 4 components. 
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Fig. 2. The potential function Q(E) for signals of density p = 0.2, with 
variance of the small components t = 10~ 6 . The three lines depict the 
potential for three different measurement rates corresponding to the critical 
values: o B p = 0.3559, a op t = 0.2817, a s = 0.2305. The two local 
maxima exists for a £ (ce s ,CfBp)> at a pt the low MSE maxima becomes 
the global one. 



where Vz = e~ z I 2 j^phxdz is a Gaussian measure for the 
integral. The initialization corresponding to the one for the 
algorithm is E t=0 = (1 — p)e + pa 2 . 

In Fig. [TJ we plot the analytical prediction for the time 
evolution of the MSE computed from the state evolution (16) , 
and we compare it to the one measured in one run of the G- 
AMP algorithm for a system size N = 3 • 10 4 . The agreement 
for such system size is already excellent. 

C. Optimal reconstruction limit 

We notice that for some measurement rates a the state 



evolution equation (16i has two different stable fixed points. 
In particular, if the iterations are initialized with E — > 0, for 
certain values of a one will reach a fixed point with much 
lower MSE than initializing with large E. In fact, one of the 
fixed points determines the MSE that would be achieved by 
the exact Bayes optimal inference. This can be seen using 
the heuristic replica method that leads to asymptotically exact 
evaluation of the logarithm of the partition function Z in 
eq. d5}. In general, if the partition function can be evaluated 
precisely then the expectations x* eq. Q and the associated 
MSE of the optimal inference can be computed. 

The replica analysis, derived for the present problem e.g. in 
JT), shows that the large N limit of log Z/N is equal to the 
global maximum of the following "potential" function 
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(18) 



Note that the state evolution corresponds to the steepest ascent 
of <&{E>)- When $(£7) has two local maxima then the fixed 



point of ( [To} depends on the initial condition. 

In Fig.Tfrwe plot the function $(£') for a signal of density 
p = 0.2, variance of small components e = 10~ 6 and three 
different values of the measurement rate a. We define three 
phase transitions 

• Qbp is defined as the largest a for which the potential 
function $>(E) has two local maxima. 

• a s is defined as the smallest a for which the potential 
function < E > (£') has two local maxima. 

• a opt is defined as the value of a for which the two 
maxima have the same height. 

III. Phase diagrams for approximate sparsity 

In Fig. [3]we plot the MSE to which the state evolution con- 
verges if initialized at large value of MSE - such initialization 
corresponds to the iterations of G-AMP when the actual signal 
is not known. For e = 0.01 we also compare explicitly to a run 
of G-AMP for system size of N = 3 • 10 4 . Depending on the 
value of density p, and variance e, two situations are possible: 
For relatively large e, as the measurement rate a decreases 
the final MSE grows continuously from E = at a = 1 to 
E = E t=a at a = 0. For lower values of e the MSE achieved 
by G-AMP has a discontinuity at «bp at which the second 
maxima of $>(E) appears. Note that the case of e — 0.01 was 
tested in J8), the case of e = 0.0025 in 0. 

In Fig. |4] we plot in full blue line the MSE to which 
the G-AMP converges and compare to the MSE achieved 
by the optimal Bayes inference, i.e. the MSE corresponding 
to the global maximum of $(£?) (in dashed red line). We 
see that, when the discontinuous transition point «bp exists, 
then in the region a opt < a < q;bp G-AMP is suboptimal. 
We remind that, in the limite — > 0, exact reconstruction is 
possible for any a > p. We see that for a < a op t and for 
a > aBP the performance of G-AMP matches asymptotically 
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Fig. 3. The MSE achieved by the G-AMP. The lines correspond to the 
evaluation of the MSE from the state evolution, the data points to the MSE 
achieved by the G-AMP algorithm for N = 3 • 10 4 . The data are for signals 
with density p = 0.1 and several values of variance of small components e 
as a function of the measurement rate a. The MSE grows continuously as a 
decreases for e > 0.00075. For smaller values of the noise a first order phase 
transition is present and the MSE jumps discontinuously at «bp(c). 
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Fig. 4. MSE achieved by the G-AMP (blue solid lines) compared to the 
MSE achieved by the Bayes-optimal inference (red dashed lines) as evaluated 
using the state evolution. The data points correspond to the MSE achieved by 
the G-AMP algorithm for N = 3 ■ 10 4 . The optimal MSE jumps at a pt- 
Hence for e < 0.00075 there is a range of measurement rates (a pt,a BP ) 
in which the BP is asymptotically suboptimal. 



the performance of the optimal Bayes inference. The two 
regions are, however, quite different. For a < a op t the final 
MSE is relatively large, whereas for a > q;bp the final MSE 
is of order e and hence is this region the problem shows a 
very good stability towards approximate sparsity. 

In Fig. |5]we summarize the critical values of asp and a op t 
for a signal of density p = 0.1 as a function of the variance of 
the small components e. Note that for e > 0.00075 (the value 
depends on p) there are no phase transitions, hence for this 
large value of e, the G-AMP algorithm matches asymptotically 
the optimal Bayes inference. Note that in the limit of exactly 
sparse signal e —> the values a opt —> p, and a s —> p. 



Fig. 5. Phase diagram for compressed sensing of approximately sparse 
signals. The density of the large signal components is p = 0.1, we are 
changing measurement rate a and variance of small components e. The critical 
values of measurement rates a pt> «bp and a s are plotted. For homogeneous 
measurement matrices, BP does not achieve optimal reconstruction in the area 
between a Q pt (red) and obp (blue). 



Whereas a B p(e -> 0) -> 0.2076, hence for a > 0.2076 the 
G-AMP algorithm is very robust with respect to appearance 
of approximate sparsity since the transition «bp has a very 
weak e-dependence, as seen in Fig. [5] 

In Fig. [6] we plot the phase diagram for fixed variance e 
in the density p - measurement rate a plane. The only space 
for improvement is in the region a opt < a < asp- In this 
region, G-AMP is not optimal because the potential Q(E) has 
two maxima, and the iterations are blocked in the "wrong" 
local maximum of the potential with the largest E. This 

situation is well known in physics as a first order transition, 
with a blocking of the dynamics in a metastable state. 

IV. Reconstruction of approximately sparse 

SIGNALS WITH OPTIMALITY ACHIEVING MATRICES 

A first order phase transition that is causing a failure (sub- 
optimality) of the G-AMP algorithm appears also in the case 
of truly sparse signals J3). In that case [3 1 showed that with the 
so-called "seeding" or "spatially coupled" measurement ma- 
trices the G-AMP algorithm is able to restore asymptotically 
optimal performance. This was proven rigorously in |4|. Using 
arguments from the theory of crystal nucleation, it was argued 
heuristically in [ 3 1 that spatial coupling provides improvement 
whenever, but only if, a first order phase transition is present. 
Spatial coupling was first suggested for compressed sensing in 
[9 1 where the authors tested cases without a first order phase 
transition (see Fig. |3j, hence no improvement was observed. 
Here we show that for measurement rates a opt < a < ctep 
seeding matrices allow to restore optimality also for the 
inference of approximately sparse signals. 

A. Seeding sampling matrices 

The block measurement matrices that we use in the 
rest of this paper are constructed as follows: The N vari- 
ables are divided into L c equally sized groups. And the M 
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Fig. 6. Phase diagram in the plane density p measurement rate a, with variance of small components e = (left), e = 10 6 (center) and e = 10 4 (right). 



measurements are divided into L r groups of M soc d measure- 
ments in the first group and A/b u ik in the others. We define 

a scc d = L c M sccd /N and a bu ik = L c M hn]k /N. The total 
measurement rate is 

(L r - lWulk 



C^secd 



Lr 



(19) 



The matrix F is then composed of L r x L c blocks and the 
matrix elements are generated independently, in such a 
way that if \x is in group q and i in group p then is a 
random number with zero mean and variance J q . p /N. Thus 
we obtain a L r x L c coupling matrix J 9 P . For the asymptotic 
analysis we assume that N — > oo and a sec d, «buik are fixed. 
Note that not all block matrices are good seeding matrices, 
the parameters have to be set in such a way that seeding is 
implemented (i.e. existence of the seed and interactions such 
that the seed grows). The coupling measurement matrix J g p 
that we use in this paper is illustrated in Fig. [7] 
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Fig. 7. Sketch of the measurement matrices we used to approach optimal 
reconstruction with the G-AMP algorithm. We use a number of variable-blocks 
L c , and L r measurement blocks. The matrix components are iid with zero 
mean and variance 1/JV for the blocks on the diagonal and for a number W 
of lower diagonals, the upper diagonal blocks have components with variance 
J/N. 

To study the state evolution for the block matrices we define 



Ep to be the mean-squared error in block p at time t. The 
Ep +1 then depends on rhp from the same block according to 
eq. (16 1, on the other hand the quantity rh p depends on the 
MSE~~E* from all the blocks q = 1, . . . , L c as follows 
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r=2 L^q=\ J rq^q 



(20) 



This kind of evolution belongs to the class for which 
threshold saturation (asymptotic achievement of performance 
matching the optimal Bayes inference solver) was proven in 
Il22ll (when L c -> 00, W 00 and L c /W > 1). This 
asymptotic guarantee is reassuring, but one must check if 
finite N corrections are gentle enough to be able to perform 
compressed sensing close to a opt even for practical system 
sizes. We hence devote the next section to numerical exper- 
iments showing that the G-AMP algorithm is indeed able to 
reconstruct close to optimality with seeding matrices. 

B. Restoring optimality 

In Fig.[8]we show the state evolution compared to the evolu- 
tion of the G-AMP algorithm for system size N = 6- 10 4 . The 
signal was of density p = 0.2 and e = 10~ 6 , the parameters of 
the measurement matrix are in the second line of Table [I] the 
L c = 30 giving measurement rate a — 0.303 which is deep 
in the region where G-AMP for homogeneous measurement 
matrices gives large MSE (for a < 0.356). We see finite size 
fluctuations, but overall the evolution corresponds well to the 
asymptotic curve. 
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TABLE I 

Parameters of the seeding matrices used in Fig. [9] 



In Fig. [9] we plot the convergence time needed to achieve 
reconstruction with E w e for several sets of parameters of 
the seeding matrices. With a proper choice of the parameters, 
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Fig. 8. Evolution of the MSE in reconstruction of signal with density p = 
0.2, variance of small components e = 10 -6 at measurement rate a = 0.303. 
The state evolution on the top is compared to the evolution of the algorithm for 
a signal size N = 6- 10 4 on the bottom. The measurement is performed using 
a seeding matrix with the following parameters: a S ecd = 0.4, ctbulk = 0.29, 
W = 3, J = 0.2, L c = 30. 
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Fig. 9. The convergence time of BP for large system sizes estimated by 
the state evolution as a function of measurement rate a. Data are for signals 
with density p = 0.2, variance of small components e = 10 _B . The red line 
is obtained using an homogeneous measurement matrix, the vertical dashed 
line corresponds to the limit this approach can achieve c*bp = 0.3554. All 
the other lines are obtained using seeding matrix with parameters specified 
in Table [T] an d varying L c , the resulting measurement rate a is computed 
from eq. (19) . With these seeding matrices and using large L c , reconstruction 
is possible at least down to »bulk = 0-282 which is very close to the 
measurement rate a op t = 0.2817. The blue point corresponds to the evolution 
illustrated in Fig. [8] 



we see that we can reach an optimal reconstruction for values 
of a extremely close to a op t- Note, however, that the number 
of iterations needed to converge diverges as a — > a opt . This 
is very similar to what has been obtain in the case of purely 
sparse signals in [3|, |4|. 

Finally, it is important to point out that this theoretical 
analysis is valid for N — > oo only. Since we eventually work 
with finite size signals, in practice, finite size effects slightly 



degrade this asymptotic threshold saturation. This is a well 
known effect in coding theory where a major question is 
how to optimise finite-length codes (see for instance [23 1). In 
Fig. [10] we plot the fraction of cases in which the algorithm 
reached successful reconstruction for different system sizes as 
a function of the number of blocks L c . We see that for a given 
size as the number of blocks is growing, i.e. as the size of one 
block decreases, the performance deteriorates. As expected the 
situation improves when size augments. Analyses of the data 
presented in Fig. [10] suggest that the size of one block that 
is needed for good performance grows roughly linearly with 
the number of blocks L c . This suggests that the probability 
of failure to transmit the information to every new block is 
roughly inversely proportional to the block size. We let for 
future work a more detailed investigation of these finite size 
effects. The algorithm nevertheless reconstructs signals at rates 
close to the optimal one even for system sizes of practical 
interest. 




Fig. 10. Fraction of instances (over 20 attempts) that were solved by the 
algorithm in less than twice the number of iterations predicted by the density 
evolution for different system sizes, as a function of the number of blocks 
L c . We used the parameters that lead to the blue curve in Fig. [9] (i.e. second 
line of Table [TJ, As N — > oo, reconstruction is reached in all the instances, 
as predicted by the state evolution. For finite N, however, reconstruction is 
not reached when L c is too large. 



V. Discussion 

A. Mismatching signal model 

In this paper we treated signals generated by the two- 
Gaussian model and we assumed knowledge of the parameters 
used for the generation. Note, however, that in the same way 
as in [3|, [24], (7J the corresponding parameters (p, e, etc.) 
can be learned using expectation maximization. For real data 
it is also desirable to use a signal model ([TJ that is matching 
the data in a better way. 

At this point we want to state that whereas all our results 
do depend quantitatively on the statistical properties of the 
signal, the qualitative features of our results (e.g. the presence 
and nature of the phase transitions) are valid for other signals, 
distinct from the two-Gaussian case that we have studied 
here, and even for the case when the signal model does not 
match the statistical properties of the actual signal. This was 



illustrated e.g. for the noisy compressed sensing of truly sparse 
signal in Q. In the same line, we noticed and tested that if 
G-AMP corresponding to e = is run for the approximately 
sparse signals the final MSE is always larger than the one 
achieved by G-AMP with the right value of e. 

We tested the G-AMP algorithm with the signal model 
([TJ and EM learning of parameters on some real images 
and we indeed observed better performance than for the G- 
AMP designed for truly sparse signals. However, to become 
competitive we also need to find better models for the signal, 
likely including the fact that the sparse components are highly 
structured for real images. We let this for future work. 

B. Presence of noise 

In this paper we studied the case of noiseless measurements, 
but the measurement noise can be straightforwardly included 
into the analysis as in |[7| where we studied the phase diagram 
in the presence of the measurement noise. The results would 
change quantitatively, but not qualitatively. 

C. Computational complexity 

The G-AMP algorithm as studied here runs in O(MN) 
steps. For dense random matrices this cannot be improved, 
since we need M N steps to only read the components of the 
matrix. Improvements are possible for matrices that permit fast 
matrix operations, e.g. Fourrier or Gabor transform matrices 
11251 . Again, testing approximate sparsity in this case is an 
interesting direction for future research. Note, however, that 
the state evolution and replica analysis of optimality does not 
apply (at least not straightforwardly) to this case. 

Another direction to improve the running time of G-AMP 
is to sample the signal with sparse matrices as e.g. in (8). 
For sparse matrices G-AMP is not anymore asymptotically 
equivalent to the belief propagation (even though it can be a 
good approximation), and the full belief propagation is much 
harder to implement and to analyze. But despite this difficulty, 
it is an interesting direction to investigate. 

D. Spatial coupling 

For small variance of the small components of the signal the 
G-AMP algorithm for homogeneous matrices does not reach 
optimal reconstruction for measurement rates close to the 
theoretical limit a op t- The spatial coupling approach, resulting 
in the design of seeding matrices, improves significantly the 
performance. For diverging system sizes optimality can be 
restored. We show that significant improvement is also reached 
for sizes of practical interest. There are, however, significant 
finite size effect that should be studied in more detail. The 
optimal design of the seeding matrix for finite system sizes (as 
studied for instance in depth in the context of error correcting 
codes 1231 ) remains an important open question. 
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